
    A weighted multiple classifier framework based on random projection.

    In this paper, we propose a weighted multiple classifier framework based on random projections. Similar to the mechanism of other homogeneous ensemble methods, the base classifiers in our approach are obtained by applying a learning algorithm to different training sets, generated by projecting the original up-space training set onto lower-dimensional down-spaces. We then apply a least-squares-based method to weight the outputs of the base classifiers, so that each classifier contributes differently to the final combined prediction. We choose Decision Tree as the learning algorithm in the proposed framework and conduct experiments on a number of real and synthetic datasets. The experimental results indicate that our framework outperforms many of the benchmark algorithms, including three homogeneous ensemble methods (Bagging, RotBoost, and Random Subspace), several well-known algorithms (Decision Tree, Random Neural Network, Linear Discriminant Analysis, K Nearest Neighbor, L2-loss Linear Support Vector Machine, and Discriminative Restricted Boltzmann Machine), and random projection-based ensembles with fixed combining rules, with regard to both classification error rates and F1 scores.
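
    As a rough illustration of this workflow (a minimal sketch, not the authors' implementation; the dataset, the projection dimension of 20, and the ensemble size of 10 are arbitrary choices), the following Python code trains decision trees on several random down-space projections and learns least-squares weights over their stacked class-probability outputs:

        import numpy as np
        from sklearn.datasets import load_digits
        from sklearn.model_selection import train_test_split
        from sklearn.random_projection import GaussianRandomProjection
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_digits(return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        n_classes = len(np.unique(y))

        # Base classifiers: one decision tree per random down-space projection.
        projections, trees = [], []
        for seed in range(10):
            rp = GaussianRandomProjection(n_components=20, random_state=seed).fit(X_tr)
            trees.append(DecisionTreeClassifier(random_state=seed).fit(rp.transform(X_tr), y_tr))
            projections.append(rp)

        def meta(X):
            # Stack every base classifier's class-probability output.
            return np.hstack([t.predict_proba(p.transform(X)) for p, t in zip(projections, trees)])

        # Least-squares weights mapping the stacked outputs to one-hot targets,
        # so each classifier contributes differently to the combined prediction.
        W, *_ = np.linalg.lstsq(meta(X_tr), np.eye(n_classes)[y_tr], rcond=None)
        y_pred = np.argmax(meta(X_te) @ W, axis=1)
        print("test accuracy:", (y_pred == y_te).mean())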

    Multi-label classification via incremental clustering on an evolving data stream.

    With the advancement of storage and processing technology, an enormous amount of data is collected on a daily basis in many applications. Nowadays, advanced data analytics are used to mine the collected data for useful information and to make predictions, contributing to the competitive advantage of companies. The increasing data volume, however, has posed many problems for classical batch learning systems, such as the need to retrain the model completely with newly arrived samples or the impracticality of storing and accessing a large volume of data. This has prompted interest in incremental learning that operates on data streams. In this study, we develop an incremental online multi-label classification (OMLC) method based on a weighted clustering model. The model adapts to changes in the data via a decay mechanism in which each sample's weight dwindles over time, so the clustering model always focuses more on newly arrived samples. In the classification process, only clusters whose weights are greater than a threshold (called mature clusters) are employed to assign labels to the samples. In our method, not only is the clustering model incrementally maintained with the revealed ground-truth labels of the arrived samples, but the number of predicted labels for a sample is also adjusted based on the Hoeffding inequality and the label cardinality. The experimental results show that our method is competitive with several well-known benchmark algorithms on six performance measures in both the stationary and the concept drift settings.
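
    The following toy Python class sketches the general idea under assumed details (it is not the paper's algorithm; in particular, the Hoeffding-based adjustment is replaced here by a simple running label-cardinality heuristic): clusters carry weights that decay over time, and only mature clusters may predict labels.

        import numpy as np

        class DecayedClusterOMLC:
            """Toy weighted clustering for online multi-label classification."""

            def __init__(self, n_labels, radius=1.0, decay=0.01, maturity=0.5):
                self.n_labels = n_labels
                self.radius, self.decay, self.maturity = radius, decay, maturity
                self.centers, self.weights, self.labels = [], [], []
                self.cardinality = 1.0   # running average number of labels per sample

            def predict(self, x):
                # Only mature clusters (weight above the threshold) may vote.
                mature = [i for i, w in enumerate(self.weights) if w > self.maturity]
                y = np.zeros(self.n_labels, dtype=int)
                if not mature:
                    return y
                i = min(mature, key=lambda j: np.linalg.norm(x - self.centers[j]))
                k = max(1, int(round(self.cardinality)))   # cardinality heuristic
                y[np.argsort(self.labels[i])[-k:]] = 1
                return y

            def update(self, x, y_true):
                # Decay all cluster weights, then absorb the sample (with its
                # revealed ground-truth labels) into the nearest cluster, or
                # open a new cluster if none is close enough.
                self.weights = [w * np.exp(-self.decay) for w in self.weights]
                nearest = None
                if self.centers:
                    j = min(range(len(self.centers)),
                            key=lambda m: np.linalg.norm(x - self.centers[m]))
                    if np.linalg.norm(x - self.centers[j]) <= self.radius:
                        nearest = j
                if nearest is None:
                    self.centers.append(np.asarray(x, dtype=float))
                    self.weights.append(1.0)
                    self.labels.append(np.asarray(y_true, dtype=float).copy())
                else:
                    self.centers[nearest] = (self.centers[nearest] + x) / 2.0
                    self.weights[nearest] += 1.0
                    self.labels[nearest] += y_true
                self.cardinality = 0.99 * self.cardinality + 0.01 * float(np.sum(y_true))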

    Evolving interval-based representation for multiple classifier fusion.

    Designing an ensemble of classifiers is a popular research topic in machine learning, since an ensemble can give better results than any of its constituent members. Furthermore, the performance of an ensemble can be improved through selection or adaptation. In the former, an optimal set of base classifiers, meta-classifier, original features, or meta-data is selected to obtain a better ensemble than one using all classifiers and features. In the latter, the base classifiers or the combining algorithms working on their outputs are made to adapt to a particular problem, meaning that the parameters of these algorithms are trained to be optimal for each problem. In this study, we propose a novel evolving combining algorithm that follows the adaptation approach for ensemble systems. Instead of using a single numerical value for the representation of each class, we propose an interval-based representation, whose optimal endpoints are found through Particle Swarm Optimization. During classification, a test instance is assigned to the class whose interval-based representation is closest to the base classifiers' predictions. Experiments conducted on a number of popular datasets confirm that the proposed method is better than well-known ensemble systems using Decision Template and Sum Rule as the combiner, L2-loss Linear Support Vector Machine, Multiple Layer Neural Network, and the ensemble selection methods based on GA-Meta-data, META-DES, and ACO.
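
    A minimal Python sketch of the classification rule described above (the interval endpoints are assumed to be given; in the paper they are tuned by Particle Swarm Optimization, which is omitted here, and the numbers below are made up):

        import numpy as np

        def interval_distance(p, lo, hi):
            # Elementwise distance from a point to an interval: zero inside the
            # interval, otherwise the gap to the nearest endpoint.
            return np.maximum(0.0, np.maximum(lo - p, p - hi))

        def classify(meta, lo, hi):
            # meta: stacked probability outputs of the base classifiers for one
            # instance; lo, hi: per-class interval endpoints of the same length.
            dists = [np.sum(interval_distance(meta, lo[m], hi[m]))
                     for m in range(lo.shape[0])]
            return int(np.argmin(dists))

        # Tiny example: 2 base classifiers x 2 classes.
        meta = np.array([0.9, 0.1, 0.8, 0.2])
        lo = np.array([[0.7, 0.0, 0.6, 0.0],    # class 0 intervals
                       [0.0, 0.6, 0.0, 0.5]])   # class 1 intervals
        hi = np.array([[1.0, 0.3, 1.0, 0.4],
                       [0.3, 1.0, 0.4, 1.0]])
        print(classify(meta, lo, hi))           # -> 0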

    Multi-layer heterogeneous ensemble with classifier and feature selection.

    Deep Neural Networks have achieved many successes when applied to visual, text, and speech information in various domains. The crucial reasons behind these successes are the multi-layer architecture and the in-model feature transformation of deep learning models. These design principles have inspired other sub-fields of machine learning, including ensemble learning. In recent years, several deep homogeneous ensemble models have been introduced with a large number of classifiers in each layer; these models therefore incur a high computational cost at classification time. Moreover, the existing deep ensemble models use all classifiers, including unnecessary ones, which can reduce the predictive accuracy of the ensemble. In this study, we propose a multi-layer ensemble learning framework called MUlti-Layer heterogeneous Ensemble System (MULES) to solve the classification problem. The proposed system works with a small number of heterogeneous classifiers to obtain ensemble diversity and is therefore efficient in resource usage. We also propose an Evolutionary Algorithm-based selection method to select the subset of suitable classifiers and features at each layer to enhance the predictive performance of MULES. The selection method uses the NSGA-II algorithm to optimize two objectives concerning classification accuracy and ensemble diversity. Experiments on 33 datasets confirm that MULES is better than a number of well-known benchmark algorithms.
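
    The following rough Python sketch (assumed details, not the MULES code) shows the kind of two objectives such an NSGA-II search could trade off for a candidate classifier subset: majority-vote accuracy on validation data and diversity measured as mean pairwise disagreement.

        import numpy as np
        from itertools import combinations

        def subset_objectives(pred_matrix, y_val, subset):
            # pred_matrix: (n_classifiers, n_samples) validation predictions;
            # subset: indices of the selected classifiers.
            preds = pred_matrix[list(subset)]
            # Objective 1: accuracy of the plain majority vote of the subset.
            vote = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)
            accuracy = float(np.mean(vote == y_val))
            # Objective 2: diversity as mean pairwise disagreement.
            pairs = list(combinations(range(len(preds)), 2))
            diversity = float(np.mean([(preds[i] != preds[j]).mean()
                                       for i, j in pairs])) if pairs else 0.0
            return accuracy, diversity   # NSGA-II would maximize both

        # Made-up predictions from 4 classifiers on 6 validation samples.
        P = np.array([[0, 1, 1, 0, 1, 0],
                      [0, 1, 0, 0, 1, 1],
                      [1, 1, 1, 0, 0, 0],
                      [0, 0, 1, 0, 1, 0]])
        y = np.array([0, 1, 1, 0, 1, 0])
        print(subset_objectives(P, y, (0, 1, 3)))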

    Deep heterogeneous ensemble.

    In recent years, deep neural networks (DNNs) have emerged as a powerful technique in many areas of machine learning. Although DNNs have achieved great breakthroughs in processing images, video, audio, and text, they also have limitations, such as requiring a large amount of labeled data for training and having a large number of parameters. Ensemble learning, meanwhile, builds a learning model by combining many different classifiers, such that the ensemble is better than any single classifier. In this study, we propose a deep ensemble framework called Deep Heterogeneous Ensemble (DHE) for supervised learning tasks. In each layer of our algorithm, the input data is passed through a feature selection method to remove irrelevant features and prevent overfitting. Cross-validation with K learning algorithms is then applied to the selected data in order to obtain the meta-data and the K base classifiers for the next layer. In this way, each layer outputs the meta-data that serves as the input data for the next layer, the base classifiers, and the indices of the selected meta-data. A combining algorithm is then applied to the meta-data of the last layer to obtain the final class prediction. Experiments on 30 datasets confirm that the proposed DHE is better than a number of well-known benchmark algorithms.
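
    A simplified Python sketch of one such layer under assumed choices (mutual-information feature selection and three particular base learners; the final combining algorithm is omitted): out-of-fold class probabilities from K-fold cross-validation form the meta-data passed to the next layer.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.feature_selection import SelectKBest, mutual_info_classif
        from sklearn.model_selection import cross_val_predict
        from sklearn.naive_bayes import GaussianNB
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_breast_cancer(return_X_y=True)

        def dhe_layer(X_in, y, k_features=10):
            # 1) Feature selection to drop irrelevant inputs for this layer.
            selector = SelectKBest(mutual_info_classif,
                                   k=min(k_features, X_in.shape[1])).fit(X_in, y)
            X_sel = selector.transform(X_in)
            # 2) Out-of-fold probabilities from each learning algorithm (the meta-data).
            learners = [DecisionTreeClassifier(random_state=0), GaussianNB(),
                        KNeighborsClassifier()]
            meta = [cross_val_predict(clf, X_sel, y, cv=5, method="predict_proba")
                    for clf in learners]
            # 3) The meta-data of this layer becomes the input data of the next layer.
            return np.hstack(meta), selector, [clf.fit(X_sel, y) for clf in learners]

        meta1, sel1, clfs1 = dhe_layer(X, y)
        meta2, sel2, clfs2 = dhe_layer(meta1, y)   # stack a second layer on the meta-data
        print(meta1.shape, meta2.shape)            # (569, 6) (569, 6)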

    The institutional experience of implementing 4DCT in NSCLC radiotherapy planning

    Background: The aim of this study was to evaluate the dose distribution achieved with four-dimensional computed tomography (4DCT) simulation. Materials and methods: The gross tumor volume (GTV) and clinical target volume (CTV) were contoured in all 10 respiratory phases of 4DCT in 30 patients with non-small cell lung cancer (NSCLC). Both 3D and 4D treatment plans were made individually for each patient using the planning target volume (PTV). The PTV3D was derived from a single CTV plus the recommended margin, and the PTV4D was derived from the 4D internal target volume, encompassing all 10 CTVs plus the setup margins. Results: The mean PTV was 460 ± 179 (69–820) cm³ for 3DCT and 401 ± 167 (127–854) cm³ for 4DCT (p = 0.0018). The dose distribution (DD) to organs at risk, especially the lungs, was lower for the 4DCT simulation. The V5%, V10%, and V20% of the total lung dose were significantly lower for 4DCT than for 3DCT. However, lung V30% and the doses to the heart, esophagus, and spinal cord were not significantly different, and neither were the conformity index and the dose heterogeneity index of the PTV. The normal tissue complication probability (NTCP) of the lung and heart was significantly lower for 4DCT than for 3DCT. Conclusions: The 4DCT simulation gives better results on the NTCP, and the organs at risk, especially the lungs, receive a significantly lower DD compared with 3DCT. The conformity index (CI), heterogeneity index (HI), and the DD to the heart, spinal cord, and esophagus were not significantly different between the two techniques.
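
    For reference, the dose-volume metrics compared above can be computed as in the short Python sketch below (standard textbook definitions, not data or code from the study): Vx is the percentage of an organ volume receiving at least x Gy, and one common conformity index is the prescription isodose volume divided by the PTV volume.

        import numpy as np

        def v_x(dose_gy, x):
            # Percentage of voxels in the structure receiving at least x Gy.
            dose_gy = np.asarray(dose_gy, dtype=float)
            return 100.0 * np.mean(dose_gy >= x)

        def conformity_index(dose_gy, ptv_mask, prescription_gy):
            # Prescription isodose volume divided by the PTV volume (one common definition).
            dose_gy = np.asarray(dose_gy, dtype=float)
            return float(np.sum(dose_gy >= prescription_gy) / np.sum(ptv_mask))

        # Made-up lung dose values (Gy) for a handful of voxels.
        lung_dose = np.array([1.2, 6.0, 11.5, 22.0, 3.3, 18.7])
        print(v_x(lung_dose, 5), v_x(lung_dose, 10), v_x(lung_dose, 20))   # V5, V10, V20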

    Measurement of the cross-section and charge asymmetry of W bosons produced in proton-proton collisions at √s=8 TeV with the ATLAS detector

    This paper presents measurements of the W⁺ → μ⁺ν and W⁻ → μ⁻ν cross-sections and the associated charge asymmetry as a function of the absolute pseudorapidity of the decay muon. The data were collected in proton-proton collisions at a centre-of-mass energy of 8 TeV with the ATLAS experiment at the LHC and correspond to a total integrated luminosity of 20.2 fb⁻¹. The precision of the cross-section measurements varies between 0.8% and 1.5% as a function of the pseudorapidity, excluding the 1.9% uncertainty on the integrated luminosity. The charge asymmetry is measured with an uncertainty between 0.002 and 0.003. The results are compared with predictions based on next-to-next-to-leading-order calculations with various parton distribution functions and have the sensitivity to discriminate between them. Comment: 38 pages in total, author list starting page 22, 5 figures, 4 tables, submitted to EPJC. All figures including auxiliary figures are available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/STDM-2017-13
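
    For reference, the muon charge asymmetry in such measurements is conventionally defined per pseudorapidity bin from the two differential cross-sections (stated here from the standard convention, not quoted from the paper):

        A_\mu(\eta_\mu) = \frac{\mathrm{d}\sigma_{W^{+}}/\mathrm{d}\eta_\mu - \mathrm{d}\sigma_{W^{-}}/\mathrm{d}\eta_\mu}{\mathrm{d}\sigma_{W^{+}}/\mathrm{d}\eta_\mu + \mathrm{d}\sigma_{W^{-}}/\mathrm{d}\eta_\mu}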

    Search for chargino-neutralino production with mass splittings near the electroweak scale in three-lepton final states in √s=13 TeV pp collisions with the ATLAS detector

    A search for supersymmetry through the pair production of electroweakinos with mass splittings near the electroweak scale and decaying via on-shell W and Z bosons is presented for a three-lepton final state. The analyzed proton-proton collision data taken at a center-of-mass energy of √s=13 TeV were collected between 2015 and 2018 by the ATLAS experiment at the Large Hadron Collider, corresponding to an integrated luminosity of 139 fb⁻¹. A search, emulating the recursive jigsaw reconstruction technique with easily reproducible laboratory-frame variables, is performed. The two excesses observed in the 2015–2016 data recursive jigsaw analysis in the low-mass three-lepton phase space are reproduced. Results with the full data set are in agreement with the Standard Model expectations. They are interpreted to set exclusion limits at the 95% confidence level on simplified models of chargino-neutralino pair production for masses up to 345 GeV.

    Measurement of the W±Z boson pair-production cross section in pp collisions at √s=13 TeV with the ATLAS detector
